Pseudo-periodic partitions of biological sequences

Authors

  • Lugang Li
  • Renchao Jin
  • Poh-Lin Kok
  • Honghui Wan

Abstract

MOTIVATION: Algorithm development for finding typical patterns in sequences, especially multiple pseudo-repeats (pseudo-periodic regions), is at the core of many problems arising in biological sequence and structure analysis. One of the most significant features of biological sequences is their high quasi-repetitiveness. Variation in the quasi-repetitiveness of genomic and proteomic texts reflects the presence and density of different kinds of biologically important information. Sensitive automatic computational methods for identifying pseudo-periodic regions of sequences are therefore important: through them we can infer, describe and understand biological properties, and seek precise molecular details of biological structures, dynamics, interactions and evolution.

RESULTS: We develop a novel computational tool for partitioning a sequence into pseudo-periodic regions. The pseudo-periodic partition is defined as a partition that, intuitively, has the minimal bias to some perfect-periodic partition of the sequence, measured by evolutionary distance. We devise a quadratic time and space algorithm for detecting a pseudo-periodic partition of a given sequence; the partition corresponds to the shortest path in the main diagonal of the directed (acyclic) weighted graph constructed by the Smith-Waterman self-alignment of the sequence. We use several typical examples to demonstrate the use of our algorithm and software system in detecting functional or structural domains and regions of proteins. A major advantage of our software is that it has a single associated parameter, the granularity factor, and a biological sequence family can be freely chosen as a training set to determine the best value of that parameter. In general, we choose all repeats (including many pseudo-repeats) in the SWISS-PROT amino acid sequence database as a typical training set. We show that the granularity factor is 0.52 and that the average agreement accuracy of pseudo-periodic partitions, detected by our software for all pseudo-repeats in the SWISS-PROT database, is as high as 97.6%.
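
As a rough illustration of the self-alignment step mentioned in the abstract, the following sketch fills a Smith-Waterman local-alignment matrix for a sequence against itself in quadratic time and space. It is not the authors' software: the scoring values (`match`, `mismatch`, `gap`), the function names, and the restriction to the upper triangle are assumptions chosen for the example, and the partition step and the granularity factor are not modelled.

```python
def self_alignment_matrix(seq, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment of seq against itself.

    Only cells with j > i are filled, so the trivial alignment of the
    sequence with itself on the main diagonal is excluded.
    The scoring values are illustrative assumptions.
    """
    n = len(seq)
    H = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            s = match if seq[i - 1] == seq[j - 1] else mismatch
            H[i][j] = max(
                0,                    # local alignment: restart at zero
                H[i - 1][j - 1] + s,  # align seq[i-1] with seq[j-1]
                H[i - 1][j] + gap,    # gap in the second copy
                H[i][j - 1] + gap,    # gap in the first copy
            )
    return H


def best_self_alignment(seq):
    """Return (score, offset) of the best off-diagonal local alignment.

    The offset j - i of the best-scoring cell is a crude hint at a
    repeat period; offset is None when no positive score exists.
    """
    H = self_alignment_matrix(seq)
    n = len(seq)
    best_score, best_offset = 0, None
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if H[i][j] > best_score:
                best_score, best_offset = H[i][j], j - i
    return best_score, best_offset


if __name__ == "__main__":
    print(best_self_alignment("ABCABCABC"))
```

For protein sequences, a substitution matrix and affine gap penalties would normally replace the flat scores used here.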

Similar resources

Identification of Pseudo-Periodic Gene Expression Profiles

Time-course gene expression profiles associated with periodic biological processes should appear periodic. However, because of inherent problems with the experimental protocols, measured gene expression data are actually pseudo-periodic, not exactly periodic. Therefore, identifying pseudo-periodically expressed genes from their time-course data could help understand the molecular mechanism of peri...

Existence of Pseudo Almost Periodic Solutions to Some Classes of Partial Hyperbolic Evolution Equations

The paper examines the existence of pseudo almost periodic solutions to some classes of partial hyperbolic evolution equations. Namely, sufficient conditions for the existence and uniqueness of pseudo almost periodic solutions to those classes of hyperbolic evolution equations are given. Applications include the existence of pseudo almost periodic solutions to the transport and heat equations w...

On some properties and applications of Horadam sequences

The Horadam sequence is a generalization of the Fibonacci numbers in the complex plane, depending on a family of four complex parameters: two recurrence coefficients and two initial conditions. The necessary and sufficient periodicity conditions formulated in [1] are used to enumerate all Horadam sequences with a given period [2]. The geometry of periodic orbits is analyzed, where regular star-...

Analysis of Pseudo-Turbulence Flow Induced by Bubble Periodic Formation in Non-Newtonian Fluids

Laser Doppler Velocimetry (LDV) has been employed to determine pseudo-turbulence characteristics of the flow field around a bubble train forming in non-Newtonian carboxymethylcellulose (CMC) aqueous solution at low gas flow rate. The Reynolds stress and turbulent intensity of the liquid were investigated by means of the Reynolds time-averaged method. The experimental results show that ax...

Existence of Weighted Pseudo Almost Periodic Mild Solutions for Nonlocal Semilinear Evolution Equations

In this paper, we are concerned with new weighted pseudo almost periodic solutions of the semilinear evolution equations with nonlocal conditions x′(t) = A(t)x(t) + f(t, x(t)), x(0) = x0 + g(x), t ∈ R. By applying the Banach fixed point theorem, measure theory, the theory of semigroups of operators to evolution families and the properties of a class of new weighted pseudo almo...

Journal:
  • Bioinformatics

Volume 20, Issue 3

Pages: -

Publication date: 2004